Let's explore the song titles and composers stored in the MusicXML files of the 2010 Wikifonia dump.

The folder wikifonia20100503 comes from a dump of the Wikifonia database, available here:

https://github.com/jganseman/musq

First, let's list all those files:


In [1]:
import glob

In [4]:
fnames = glob.glob("../MusicXML_files/wikifonia20100503/*.xml")
fnames[:10]


Out[4]:
['../MusicXML_files/wikifonia20100503\\100.xml',
 '../MusicXML_files/wikifonia20100503\\1000.xml',
 '../MusicXML_files/wikifonia20100503\\1002.xml',
 '../MusicXML_files/wikifonia20100503\\1003.xml',
 '../MusicXML_files/wikifonia20100503\\1007.xml',
 '../MusicXML_files/wikifonia20100503\\1008.xml',
 '../MusicXML_files/wikifonia20100503\\101.xml',
 '../MusicXML_files/wikifonia20100503\\1011.xml',
 '../MusicXML_files/wikifonia20100503\\1019.xml',
 '../MusicXML_files/wikifonia20100503\\102.xml']

Now let's read a file and extract its title and composer, starting with the first file in the list.


In [5]:
import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9

In [12]:
tree = ET.ElementTree(file=fnames[0])

In [16]:
root = tree.getroot()
root


Out[16]:
<Element 'score-partwise' at 0x00000000054C96D8>

In [17]:
list(root)  # getchildren() was removed in Python 3.9


Out[17]:
[<Element 'work' at 0x00000000054C9818>,
 <Element 'movement-number' at 0x00000000054C9908>,
 <Element 'movement-title' at 0x00000000054C9958>,
 <Element 'identification' at 0x00000000054C99A8>,
 <Element 'part-list' at 0x00000000054C9D18>,
 <Element 'part' at 0x00000000054C9F98>,
 <Element 'part' at 0x00000000054DD638>,
 <Element 'part' at 0x00000000054F2CC8>]

In [21]:
root.find('identification/creator').text


Out[21]:
'Miles Davis'

In [24]:
root.find('movement-title').text


Out[24]:
'All Blues'
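
In MusicXML, each `creator` element normally carries a `type` attribute (for example `composer` or `lyricist`), and `find('identification/creator')` returns only the first one, whatever its type. Below is a minimal sketch of asking for the composer explicitly, assuming the files follow that convention. The inline XML fragment and the helper name `find_creator` are made up for illustration and do not come from the dump.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written MusicXML fragment (not taken from the dump).
SAMPLE = """<score-partwise>
  <movement-title>Example Song</movement-title>
  <identification>
    <creator type="lyricist">Some Lyricist</creator>
    <creator type="composer">Some Composer</creator>
  </identification>
</score-partwise>"""


def find_creator(root, kind="composer"):
    """Return the text of the first <creator> with the given type, or None."""
    # ElementTree's XPath subset supports the [@attrib='value'] predicate.
    return root.findtext(f"identification/creator[@type='{kind}']")


sample_root = ET.fromstring(SAMPLE)
first_creator = sample_root.find("identification/creator").text  # any type
composer_only = find_creator(sample_root)                         # composer only
```

Whether this matters for a given file depends on how the score was entered; files with a single untyped `creator` would need the plain lookup as a fallback.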

Let's wrap these two lookups in a function.


In [28]:
def title_composer(fname):
    "Returns (composer, title) read from a MusicXML file."
    root = ET.ElementTree(file=fname).getroot()
    return (root.find('identification/creator').text, root.find('movement-title').text)

In [29]:
title_composer(fnames[0])


Out[29]:
('Miles Davis', 'All Blues')

Let's now run a loop over all our data:


In [31]:
metadata = [title_composer(fname) for fname in fnames]
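
The list comprehension above stops at the first file that lacks a `creator` or `movement-title` element (`find` returns `None`, and `.text` then raises `AttributeError`) or that is not well-formed XML. The run shown here evidently completed, but a more forgiving variant may be useful for other dumps. The following is only a sketch: `safe_title_composer` is a hypothetical name, and returning `None` for missing fields is an assumption about the desired behaviour.

```python
import xml.etree.ElementTree as ET


def safe_title_composer(fname):
    """Return (composer, title); a field is None when missing or unreadable."""
    try:
        root = ET.parse(fname).getroot()
    except (ET.ParseError, OSError):
        # Malformed XML or unreadable file: keep the row, but leave it empty.
        return (None, None)
    # findtext returns None (its default) when the element is absent.
    return (root.findtext('identification/creator'),
            root.findtext('movement-title'))
```

It can be dropped into the same loop, `[safe_title_composer(fname) for fname in fnames]`, and rows with missing values can be inspected afterwards.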

Finally, let's build a pandas dataframe using this data:


In [32]:
import pandas as pd

In [35]:
df = pd.DataFrame(data=metadata, index=fnames, columns=('composer', 'song_title'))
df.head(10)


Out[35]:
                                              composer                        song_title
../MusicXML_files/wikifonia20100503\100.xml   Miles Davis                     All Blues
../MusicXML_files/wikifonia20100503\1000.xml  Ukrainian folksong              Gehe nicht, oh Gregor
../MusicXML_files/wikifonia20100503\1002.xml  Frederik Vahle                  Schlaflied für Anne
../MusicXML_files/wikifonia20100503\1003.xml  Belle and Sebastian             Get me away
../MusicXML_files/wikifonia20100503\1007.xml  Belle and Sebastian             Storytelling
../MusicXML_files/wikifonia20100503\1008.xml  Howard Greenfield, Neal Sedaka  Where the Boys Are
../MusicXML_files/wikifonia20100503\101.xml   Charlie Parker                  Anthropology
../MusicXML_files/wikifonia20100503\1011.xml  Ned Washington                  The Nearness of You
../MusicXML_files/wikifonia20100503\1019.xml  John Lennon, Yoko Ono           Happy Xmas
../MusicXML_files/wikifonia20100503\102.xml   The Rolling Stones              Paint it black

In [48]:
df.shape


Out[48]:
(2265, 2)

We can now easily filter the data, for instance to find all titles by the Rolling Stones. Note that the search matches any substring of the composer field, regardless of case.


In [46]:
df[df.composer.str.contains('rolling', case=False)]


Out[46]:
                                             composer            song_title
../MusicXML_files/wikifonia20100503\102.xml  The Rolling Stones  Paint it black

In [47]:
df[df.composer.str.contains('stone', case=False)]


Out[47]:
                                              composer                                   song_title
../MusicXML_files/wikifonia20100503\102.xml   The Rolling Stones                         Paint it black
../MusicXML_files/wikifonia20100503\3785.xml  Harry Stone, Jack Stapp                    Chattanoogie Shoe Shine Boy
../MusicXML_files/wikifonia20100503\3787.xml  Jay Livingstone, Ray Evans                 To Each His Own
../MusicXML_files/wikifonia20100503\3866.xml  Nelson, Touchstone                         Just Because
../MusicXML_files/wikifonia20100503\4865.xml  Mack David, Al Hoffman, Jerry Livingstone  Chi-baba chi-baba
../MusicXML_files/wikifonia20100503\5416.xml  Merle Travis, Cliffie Stone, Eddie Kirk    So Firm, So Round, So Fully Packed

In [49]:
df[df.composer.str.contains('keith', case=False)]


Out[49]:
                                              composer                     song_title
../MusicXML_files/wikifonia20100503\1099.xml  Keith Richards, Mick Jagger  Angie
../MusicXML_files/wikifonia20100503\3765.xml  Ben Peters, Vivian Keith     Before The Next Teardrop Falls
../MusicXML_files/wikifonia20100503\5303.xml  Keith Richards, Mick Jagger  Honky-Tonk Woman
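
Substring searches on the raw `composer` string hit partial names ('Livingstone', 'Touchstone') and miss songs credited to band members rather than to the band. One way to work with individual names is to split the field into one row per person. Here is a sketch on a tiny frame built from rows printed above. The separators (a comma and the word 'and') are an assumption based on those rows, and `toy` and `people` are illustrative names.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "composer": [
            "Keith Richards, Mick Jagger",
            "Ben Peters, Vivian Keith",
            "John Lennon and Paul McCartney",
        ],
        "song_title": [
            "Angie",
            "Before The Next Teardrop Falls",
            "I saw her standing there",
        ],
    }
)

# A multi-character pattern is treated as a regular expression by str.split.
names = toy["composer"].str.split(r"\s*,\s*|\s+and\s+")

# explode() turns each list of names into one row per name.
people = toy.assign(person=names).explode("person").reset_index(drop=True)
people["person"] = people["person"].str.strip()

# Exact match on a full name instead of a substring search.
keith_richards = people[people["person"] == "Keith Richards"]
```

A split on 'and' would also cut band names such as 'Belle and Sebastian' in two, so the separator list needs checking against the real data before relying on it.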

In [50]:
df[df.composer.str.contains('lennon', case=False)]


Out[50]:
                                              composer                        song_title
../MusicXML_files/wikifonia20100503\1019.xml  John Lennon, Yoko Ono           Happy Xmas
../MusicXML_files/wikifonia20100503\120.xml   John Lennon, Paul McCartney     Lady Madonna
../MusicXML_files/wikifonia20100503\151.xml   John Lennon, Paul McCartney     Eleanor Rigby
../MusicXML_files/wikifonia20100503\1904.xml  John Lennon, Paul McCartney     A Hard Day's Night
../MusicXML_files/wikifonia20100503\2430.xml  Paul McCartney, John Lennon     Michelle
../MusicXML_files/wikifonia20100503\2519.xml  John Lennon, Paul McCartney     Yesterday
../MusicXML_files/wikifonia20100503\3012.xml  John Lennon, Paul McCartney     All My Loving
../MusicXML_files/wikifonia20100503\3153.xml  John Lennon and Paul McCartney  I saw her standing there
../MusicXML_files/wikifonia20100503\3154.xml  John Lennon, Paul McCartney     I saw her standing there
../MusicXML_files/wikifonia20100503\3155.xml  John Lennon, Paul McCartney     Ticket to ride
../MusicXML_files/wikifonia20100503\3413.xml  John Lennon, Paul McCartney     Do You Want To Know A Secret
../MusicXML_files/wikifonia20100503\3485.xml  John Lennon, Paul McCartney     Ob-La-Di Ob-La-Da
../MusicXML_files/wikifonia20100503\368.xml   John Lennon, Paul McCartney     A Hard Day's Night
../MusicXML_files/wikifonia20100503\3689.xml  John Lennon, Paul McCartney     Love Me Do
../MusicXML_files/wikifonia20100503\3834.xml  John Lennon, Paul McCartney     I Want To Hold Your Hand
../MusicXML_files/wikifonia20100503\3864.xml  John Lennon, Paul McCartney     And I Love Her
../MusicXML_files/wikifonia20100503\3874.xml  John Lennon, Paul McCartney     The Fool On The Hill
../MusicXML_files/wikifonia20100503\3911.xml  John Lennon, Paul McCartney     Penny Lane
../MusicXML_files/wikifonia20100503\3975.xml  John Lennon, Paul McCartney     You Won't See Me
../MusicXML_files/wikifonia20100503\3980.xml  John Lennon, Paul McCartney     Please Please Me
../MusicXML_files/wikifonia20100503\4074.xml  John Lennon, Paul McCartney     Yellow Submarine
../MusicXML_files/wikifonia20100503\4288.xml  John Lennon, Paul McCartney     Norwegian Wood
../MusicXML_files/wikifonia20100503\4400.xml  John Lennon, Paul McCartney     All You Need Is Love
../MusicXML_files/wikifonia20100503\4740.xml  John Lennon, Paul McCartney     Eight Days A Week
../MusicXML_files/wikifonia20100503\4741.xml  John Lennon, Paul McCartney     From Me To You
../MusicXML_files/wikifonia20100503\4742.xml  John Lennon, Paul McCartney     The Long And Winding Road
../MusicXML_files/wikifonia20100503\4750.xml  John Lennon, Paul McCartney     Hey Jude
../MusicXML_files/wikifonia20100503\5017.xml  John Lennon, Yoko Ono           Happy Xmas
../MusicXML_files/wikifonia20100503\530.xml   John Lennon                     Imagine
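
The listing above contains some titles more than once (for example 'Happy Xmas' and 'A Hard Day's Night'), so the dump apparently holds several transcriptions of the same song. The sketch below flags such repeats with `duplicated`, using a small frame built from four of the rows above (short file names stand in for the full paths). Lower-casing and stripping the title before comparing is an assumption about what should count as the same song.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "composer": [
            "John Lennon, Yoko Ono",
            "John Lennon, Paul McCartney",
            "John Lennon, Yoko Ono",
            "John Lennon",
        ],
        "song_title": ["Happy Xmas", "Yesterday", "Happy Xmas", "Imagine"],
    },
    index=["1019.xml", "2519.xml", "5017.xml", "530.xml"],
)

# Normalise case and surrounding whitespace before comparing titles.
key = toy["song_title"].str.strip().str.lower()

# keep=False marks every member of a duplicate group, not just the later ones.
repeats = toy[key.duplicated(keep=False)]
```

On the full `df`, the same two lines applied to `df["song_title"]` would list every file whose title occurs more than once.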
